Facebook Develops Machine Translation System for 100 Languages
2020-10-22
Facebook has developed the first machine learning model that can translate between any two of 100 languages without going into English first.
Facebook says the new multilingual machine translation model was created to help its more than two billion users worldwide.
The company is still testing the translation system -- which it calls M2M-100 -- and hopes to add it to different products in the future.
The social media service says it has made the system open source -- meaning its computer code will be freely available for others to copy or change.
Angela Fan, a research assistant at Facebook, explained the new machine translation model this week on one of the company's websites.
She said its development represented a "milestone" in progress after years of "foundational work in machine translation."
Fan said the model produces better results than other machine learning systems that depend on English to help in the translation process.
Those systems use English as an intermediate step -- like a bridge -- to translate between two non-English languages.
One example would be a translation from Chinese to French.
Fan noted that many machine translation models begin by translating from Chinese to English first, and then from English to French.
This is done "because English training data is the most widely available," she said.
But such a method can lead to mistakes in translation.
"Our model directly trains on Chinese to French data to better preserve meaning," Fan said.
Facebook said the system outperformed English-centered systems on a widely used measure that uses data to rate the quality of machine translations.
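The difference between using English as a bridge and translating directly can be shown with a very small sketch. This is only a toy: the three word tables are made-up examples, and M2M-100 itself is a trained model, not a set of lookup tables. The point is that two different Chinese words can become the same English word, so the meaning is already lost before the French step begins.

```python
# Toy word tables. These are hypothetical examples for illustration only;
# they are not taken from M2M-100 or from any Facebook data.
ZH_EN = {"银行": "bank", "河岸": "bank"}      # money bank and river bank both become "bank"
EN_FR = {"bank": "banque"}                   # the English word alone cannot tell the two apart
ZH_FR = {"银行": "banque", "河岸": "rive"}    # direct Chinese-to-French pairs keep the difference


def pivot_translate(word: str) -> str:
    """Translate Chinese to French by going through English first."""
    english = ZH_EN[word]
    return EN_FR[english]


def direct_translate(word: str) -> str:
    """Translate Chinese to French without an English step."""
    return ZH_FR[word]


for word in ZH_FR:
    print(word, "| pivot:", pivot_translate(word), "| direct:", direct_translate(word))
```

In this sketch the pivot path gives "banque" for both words, while the direct path gives "rive" for the river bank. Real systems look at whole sentences, so they make this kind of mistake less often, but the same loss of meaning can still happen at the English step.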
Facebook says about two-thirds of its users communicate in a language other than English.
The company already carries out an average of 20 billion translations every day on Facebook's News Feed.
But it faces a huge test with many users publishing massive amounts of content in more than 160 languages.
The development team trained, or directed, the new model on a data set of 7.5 billion sentence pairs for 100 languages.
In addition, the system was trained on a total of 2,200 language directions.
Facebook said this is 10 times the number used by the best machine translation models of the past.
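A short calculation helps put the number of language directions in context. A "direction" here is taken to mean an ordered pair, so Chinese to French and French to Chinese count as two. The 100 languages and 2,200 directions come from the story; the English-centered count is an assumption added only for comparison, because the story does not say how many directions the earlier models covered.

```python
languages = 100
trained_directions = 2_200  # number reported in the story

# Every ordered pair of two different languages is one possible direction.
possible_directions = languages * (languages - 1)

# Assumption for comparison: a system that only translates into and out of
# English, over the same 100 languages, has two directions for each of the
# 99 other languages.
english_centered_directions = 2 * (languages - 1)

share_trained = trained_directions / possible_directions

print("possible directions:", possible_directions)
print("English-centered directions:", english_centered_directions)
print("share of possible directions trained:", round(share_trained, 3))
```

Under these assumptions, the model was trained on a little over one-fifth of all possible directions, and on roughly 11 times as many directions as an English-centered system for the same languages would have.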
One difficulty the team faced was trying to develop an effective machine translation system for language combinations that are not widely used.
Facebook calls these "low-resource languages."
The data used to create the new model was collected from content available on the internet.
But there is limited internet data on low-resource languages.
To deal with this problem, Facebook said it used a method called back-translation.
This method can create "synthetic translations" to increase the amount of data used to train on low-resource languages.
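Back-translation, as it is usually described, can be sketched in a few lines. The idea is to take real sentences in the target language, run them through a model that translates in the opposite direction, and pair each machine-made source sentence with the real sentence it came from. The function names and the tiny word table below are hypothetical stand-ins; the story does not describe how Facebook built this step, and a real system would use a trained translation model instead of a word table.

```python
from typing import Callable, List, Tuple


def back_translate(
    target_sentences: List[str],
    reverse_model: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Build (synthetic source, real target) pairs for training.

    reverse_model translates from the target language back into the
    source language. Its output is the "synthetic translation".
    """
    return [(reverse_model(sentence), sentence) for sentence in target_sentences]


# Hypothetical stand-in for a trained target-to-source model.
TOY_FR_EN = {"bonjour": "hello", "monde": "world", "merci": "thanks"}


def toy_reverse_model(sentence: str) -> str:
    # Word-by-word lookup; unknown words are left unchanged.
    return " ".join(TOY_FR_EN.get(word, word) for word in sentence.split())


synthetic_pairs = back_translate(["bonjour monde", "merci"], toy_reverse_model)
print(synthetic_pairs)
```

The pairs made this way can be added to the real sentence pairs when training a model in the source-to-target direction. French and English are used here only because the example is easy to read; they are not low-resource languages.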
For now, the company says, it plans to continue exploring new language research methods while working to improve the new model.
No date has been set for launching the translation system on Facebook.
But Angela Fan said the new system marks an important step for Facebook, especially for the times we live in.
"Breaking language barriers through machine language translation is one of the most important ways to bring people together, provide authoritative information on COVID-19, and keep them safe from harmful content," she said.
I'm Bryan Lynn.
Bryan Lynn wrote this story for VOA Learning English, based on reports from Facebook and Agence France-Presse. George Grow was the editor.
We want to hear from you. Write to us in the Comments section, and visit our Facebook page.
_______________________________________________________________
Words in This Story
translate - v. change written or spoken words from one language to another
code - n. a set of rules used to instruct computers how to behave or do things
milestone - n. an event that reaches never before seen levels
intermediate - adj. between two different stages in a process
preserve - v. keep something the same or prevent it from being damaged or destroyed
pair - n. two things that look the same and are used together
content - n. information contained in a piece of writing, a speech, a movie or on the internet
synthetic - adj. not made from substances or in the usual way
authoritative - adj. respected and considered to be accurate